Digit DP

Digit Dynamic Programming

Did you ever encounter the problem of answering questions of the form: How many numbers $x$ in the in the range $[a,b]$ fulfill a certain property $f(x)$ which depends on the digits of the number. For example, how many numbers in the range $[1,10^6]$ have a prime number as their digit sum? If yes, then you'll find here a technique with which you can answer such queries even for very big ranges.

Simple solution

If $a,b$ in the above problem are small ($\leq 10^7$), a straightforward solution would be to just iterate over all numbers and increment a counter everytime the property is fulfilled:

bool pred(long long number) {
    long long digitSum = 0LL;

    while (number > 0) {
        digitSum += (number % 10);
        digitSum /= 10;
    }

    return is_prime(digitSum);
}
    
long long how_many(long long a, long long b) {
    long long counter = 0LL;    
    for (long long x = a; x <= b; ++x) {
            if (pred(x)) {
                counter++;
            }
        }
    return counter;
}

The for loop suggests that this is a linear algorithm, but it's actually exponential $\mathcal{O}(10^n)$ in the number of digits of $d := b-a$. For $a=1, b=10^{200}$ it'd take too long. So there must be a better solution...

Key concecpt and states

The key idea for an efficient solution is to treat the number not as a number but as a string of digits. If we then iterate in some way over the digits and combine our results from one iteration with the results from the previous iteration, we can reduce the time complexity to linear in the number of digits (which is logarithmtic in the number itself). How can we iterate over the digits? Well, we just need to iterate over every suffix until we reach the "length" of the number (which is $\lfloor \log_{10}{ \text{digits}(b-a) } \rfloor) + 1$. Even for the case of $a = 1, b = 10^{200}$ we would only need $\lfloor \log_{10}{ 10^{200} } \rfloor) + 1 = 201$ iterations over the suffixes.

Ok, so let's define a dynamic programming state $\text{dp}$. We will need to keep track of our suffix length $i$ and of the sum of digits $s$ we already encountered. Because our upper limit will be $10^{200}$ we'll need $201$ different states for the suffixes and $9 * 201 = 1809$ states for possible sums (as the maximum digit sum can be a number consisting of only $9$'s and the minimum is $0$).

const int MAX_NUM_LENGTH = 203;
const int MAX_DIGIT_SUM = 9 * MAX_NUM_LENGTH;

long long dp[MAX_NUM_LENGTH][MAX_DIGIT_SUM];

State transitions

The next step in our solution is to determine the transitions in our DP table. First, we have exactly one empty suffix with sum $0$, so $\text{dp}[0][0] = 1;$. If we then have a suffix of length $i$, then it's digit sum can be any digit sum from the previous suffix of length $i-1$ plus the current digit $d$. Thus, we get following relation: $$ dp[i][s] = \sum_{d=0}^{9} dp[i-1][s-d]; $$ This thought leads to the following implementation which prints the result in the case $[1,10]$ which is 2,3,5,7, so we get... 37? Why that?


// includes ...
const int MAX_NUM_LENGTH = 201;
const int MAX_DIGIT_SUM = 9 * MAX_NUM_LENGTH;
long long dp[MAX_NUM_LENGTH][MAX_DIGIT_SUM];

bool is_prime(long long n) {return // ...;}

long long how_many(string& number) {
    const int n = number.size();
    dp[0][0] = 1;
    for (int i = 1; i <= n; ++i) {
        for (int d = 0; d < 10; ++d) {
            for (int s = d; s <= MAX_DIGIT_SUM; ++s) {
                dp[i][s] += dp[i-1][s-d];
            }
        }
    }
    long long primeDSum = 0;
    for (int i = 1; i <= MAX_DIGIT_SUM; ++i) {
        if (is_prime(i)) {
            primeDSum += dp[n][i];
        }
    } 
    return primeDSum;
}

int main() {
    string s("10");
    long long result = how_many(s);
    cout << result << "\n";
    return 0;
}

Bounding state

At this point, you may be wondering why we get $37$ instead of $4$. The reason is simple: Instead of calculating how many numbers $\leq 10$ have a prime digit sum, we calculated how many numbers WITH TWO DIGITS have prime digit sum, which is, after a quick check, exactly $37$. In order to incorporate an upper bound, we somehow need an additional state. Let's call it $\text{b}$ for bounding. If $b = 0$, we will consider unbounded suffixes (like before), so nothing changes. We can just copy everything from above, but now with three dimensions ($\text{dp}[i][s][b]$): $$ \begin{aligned} dp[0][0][0] &= 1 \\ dp[i][s][0] &= \sum_{d=0}^{9} dp[i-1][s-d][0]; \end{aligned} $$

For the bounded case, we the same base case holds, so: $$ dp[0][0][1] = 1 $$ However, in the general case we need to be careful. Let $d_i$ be the ith lowest order digit in our initial number. Then, for each digit $d < d_i$, we can use the unbounded suffix as the resulting number will be strictly less than our upper bound. However, for $d = d_i$, we can only use bounded suffixes, because otherwise we would count elements outside the range. In total, we get: $$ dp[i][s][1] = \left( \sum_{d=0,d < d_i} dp[i-1][s-d][0] \right) + dp[i-1][s-d_i][1] $$ Incorporating this gives the final code and the answer for the range $[1,10^{200}]$ is $7726631918904331771$.


// includes ...
const int MAX_NUM_LENGTH = 203;
const int MAX_DIGIT_SUM = 9 * MAX_NUM_LENGTH;
long long dp[MAX_NUM_LENGTH][MAX_DIGIT_SUM][2];

bool is_prime(long long n) {return // ...;}
    
long long how_many(string& number) {
    const int n = number.size();
    dp[0][0][0] = 1;
    dp[0][0][1] = 1;
    for (int i = 1; i <= n; ++i) {
        // ith lowest order digit is at n-i
        int currDigit = number[n-i] - '0';
        for (int d = 0; d < 10; ++d) {
            for (int s = d; s <= 9*n; ++s) {
                // Bounded case    
                if (d < currDigit) {
                    dp[i][s][1] += dp[i-1][s-d][0];
                } else if (d == currDigit) {
                    dp[i][s][1] += dp[i-1][s-d][1];
                }
                // Unbounded
                dp[i][s][0] += dp[i-1][s-d][0];
            }
        }
    }
    long long primeDSum = 0;
    for (int i = 0; i < MAX_DIGIT_SUM; ++i) {
        if (is_prime(i)) {
            // We are only interested in bounded suffixes
            primeDSum += dp[n][i][1];
        }
    }
    return primeDSum;
}

int main() {
    
    string s("13");
    long long result = how_many(s);
    cout << result << "\n";

    return 0;
}